Inconsistent Data Cleaning Based on the Maximum Dependency Set and Attribute Correlation



Related articles

Research of Data Cleaning Methods Based on Dependency Rules

This paper introduces the concept and principles of data cleaning, analyzes the types and causes of dirty data, and outlines the key steps of a typical cleaning process. It puts forward a scalable and versatile data cleaning framework and, for data with attribute dependency relations, designs several violation-data discovery algorithms expressed as formal formulas, which can obtain inconsiste...


Approaches to attribute reductions based on rough set and matrix computation in inconsistent ordered information systems

In order to conduct classification analysis in inconsistent ordered information systems, notions on possible and compatible distribution reductions are proposed in this paper. The judgement theorems and discernibility matrices associated with the two reductions are examined, from which we can obtain an approach to the two reductions in rough set theory. Furthermore, the dominance matrix, possib...


Routing Attribute Data Mining Based on Rough Set Theory

QOSPF (Quality of Service Open Shortest Path First) based on QoS routing has been recognized as a missing piece in the evolution of QoS-based services on the Internet. Data mining has emerged as a tool for data analysis, discovery of new information, and autonomous decision making. This article focuses on routing algorithms and their applications for computing QoS routes in OSPF protocol. The p...


Cleaning the GenBank Arabidopsis thaliana data set.

Data driven computational biology relies on the large quantities of genomic data stored in international sequence data banks. However, the possibilities are drastically impaired if the stored data is unreliable. During a project aiming to predict splice sites in the dicot Arabidopsis thaliana, we extracted a data set from the A.thaliana entries in GenBank. A number of simple 'sanity' checks, ba...


Journal

Journal title: Symmetry

Year: 2018

ISSN: 2073-8994

DOI: 10.3390/sym10100516